Fast learning rates with heavy-tailed losses

Authors

  • Vu C. Dinh
  • Lam Si Tung Ho
  • Binh T. Nguyen
  • Duy M. H. Nguyen
Abstract

We study fast learning rates when the losses are not necessarily bounded and may have a distribution with heavy tails. To enable such analyses, we introduce two new conditions: (i) the envelope function sup_{f∈F} |ℓ∘f|, where ℓ is the loss function and F is the hypothesis class, exists and is L^r-integrable for some r > 1, and (ii) ℓ satisfies the multi-scale Bernstein's condition on F. Under these assumptions, we prove that learning rates faster than O(n^(-1/2)) can be obtained and, depending on r and the multi-scale Bernstein's powers, can be arbitrarily close to O(n^(-1)). We then verify these assumptions and derive fast learning rates for the problem of vector quantization by k-means clustering with heavy-tailed distributions. The analyses enable us to obtain novel learning rates that extend and complement existing results in the literature from both theoretical and practical viewpoints.
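The application discussed in the abstract, vector quantization by k-means, minimizes the empirical distortion (mean squared distance of each sample to its nearest codepoint) over a set of k codepoints; with heavy-tailed data this loss is unbounded, which is exactly the regime the paper's conditions address. A minimal sketch of Lloyd's algorithm on heavy-tailed samples (all names here are illustrative, not from the paper):

```python
import numpy as np

def kmeans_quantize(X, k, iters=50, seed=0):
    """Lloyd's algorithm for vector quantization: find k codepoints
    minimizing the empirical distortion, i.e. the mean squared
    distance from each sample to its nearest codepoint."""
    rng = np.random.default_rng(seed)
    # initialize codepoints from distinct random samples
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # squared distances from every sample to every codepoint
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        # move each codepoint to the mean of its cell (if non-empty)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    # empirical distortion under the final codepoints
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return centers, d2.min(axis=1).mean()

# Heavy-tailed sample: Student's t with 3 degrees of freedom has
# finite variance but an infinite fourth moment, so classical
# bounded-loss analyses do not apply directly.
rng = np.random.default_rng(1)
X = rng.standard_t(df=3, size=(2000, 2))
centers, distortion = kmeans_quantize(X, k=4)
```

The distortion here is the unbounded loss whose tail behavior the paper's envelope and multi-scale Bernstein conditions are designed to control.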


Similar resources

Reduced-bias estimators for the Distortion Risk Premiums for Heavy-tailed distributions

Estimation of the occurrence of extreme events is central to the study of risk premiums in actuarial science, insurance, and finance. Heavy-tailed distributions are used to model large claims and losses. In this paper we deal with the empirical estimation of distortion risk premiums for heavy-tailed losses using extreme value statistics. This approach can produce a potential bias i...

Full text

Fast Rates with Unbounded Losses

We present new excess risk bounds for randomized and deterministic estimators for general unbounded loss functions including log loss and squared loss. Our bounds are expressed in terms of the information complexity and hold under the recently introduced v-central condition, allowing for high-probability bounds, and its weakening, the v-pseudoprobability convexity condition, allowing for bounds...

Full text

Robust Estimation of Transition Matrices in High Dimensional Heavy-tailed Vector Autoregressive Processes

Gaussian vector autoregressive (VAR) processes have been extensively studied in the literature. However, Gaussian assumptions are stringent for heavy-tailed time series that frequently arise in finance and economics. In this paper, we develop a unified framework for modeling and estimating heavy-tailed VAR processes. In particular, we generalize the Gaussian VAR model by an elliptical VAR mode...

Full text

Heavy-Tailed Symmetric Stochastic Neighbor Embedding

Stochastic Neighbor Embedding (SNE) has been shown to be quite promising for data visualization. Currently, the most popular implementation, t-SNE, is restricted to a particular Student t-distribution as its embedding distribution. Moreover, it uses a gradient descent algorithm that may require users to tune parameters such as the learning step size, momentum, etc., in finding its optimum. In this p...

Full text

Efficient Max-Margin Learning in Laplacian MRFs for Monocular Depth Estimation

While designing a Markov Random Field, especially one with continuous states such as in the task of depth estimation, one is presented with the modeling choice of what distribution to use. While distributions such as Gaussian are easy to work with, because of tractable inference and learning, they often do not model the data well. In particular, the statistics of natural images are heavy-tailed...

Full text


Publication date: 2016